UNIVERSITY OF WEST BOHEMIA IN PILSEN, DEPARTMENT OF CYBERNETICS Optimization of Features for Robust Speaker Recognition
Abstract
Our speaker recognition group currently uses an older feature extraction method that was originally developed for speech recognition: standard Mel Frequency Cepstral Coefficients (MFCC), optionally extended by delta and acceleration coefficients. Whereas features for speech recognition have continued to evolve and be optimized, the features used for speaker recognition have remained the same. These outdated features suffer from various deficiencies, low robustness in particular, and are arguably unsuitable for practical applications. This study examines possibilities for improving the features and concludes by proposing an appropriate feature extraction technique that combines the methods explored. The main emphasis is placed on robustness, i.e. on noisy test data and/or channel disturbances (e.g. microphone mismatch). The study can be divided into several parts. First, the standard MFCC and Perceptual Linear Prediction (PLP) feature sets were optimized, i.e. the optimal numbers of band filters and cepstral coefficients were examined. Next, the influence of delta and acceleration coefficients was discussed. Then, channel normalization techniques were employed. Next, the possibilities of the linear transformations Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) were investigated. Then, smoothing of the spectrum or cepstrum in time was examined. Finally, several proposed combinations of the approaches described above were tested. The newly proposed features decrease the recognition error rate by 35-50%.
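Two of the steps named in the abstract, channel normalization and the delta and acceleration coefficients, can be illustrated with a minimal Python sketch. The abstract does not say which normalization technique or which delta formula the study used, so the choices here are assumptions: cepstral mean normalization stands in for channel normalization, and the deltas use the common regression formula over a window of two frames with edge frames replicated. The function names are hypothetical and the input is taken to be a list of per-frame cepstral vectors that have already been computed.

```python
from typing import List

Frames = List[List[float]]  # one inner list of cepstral coefficients per frame


def cepstral_mean_normalization(frames: Frames) -> Frames:
    """Subtract the per-coefficient mean over the utterance from every frame.

    Assumption: this stands in for the unspecified channel normalization.
    """
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n_frames for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]


def delta(frames: Frames, window: int = 2) -> Frames:
    """Regression-based delta coefficients; edge frames are replicated.

    Assumption: the window size and edge handling are illustrative choices.
    """
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    denom = 2 * sum(k * k for k in range(1, window + 1))

    def frame_at(t: int) -> List[float]:
        # Clamp the index so frames before the start or after the end
        # repeat the first or last frame.
        return frames[min(max(t, 0), n_frames - 1)]

    out: Frames = []
    for t in range(n_frames):
        row = []
        for d in range(dim):
            num = sum(
                k * (frame_at(t + k)[d] - frame_at(t - k)[d])
                for k in range(1, window + 1)
            )
            row.append(num / denom)
        out.append(row)
    return out


def add_dynamic_features(frames: Frames) -> Frames:
    """Return normalized static features with delta and acceleration appended."""
    static = cepstral_mean_normalization(frames)
    d1 = delta(static)   # delta coefficients
    d2 = delta(d1)       # acceleration coefficients (delta of delta)
    return [s + a + b for s, a, b in zip(static, d1, d2)]
```

Calling `add_dynamic_features` on frames of dimension D yields frames of dimension 3D. Subtracting the utterance mean removes a constant offset in the cepstral domain, which is how a fixed channel such as a particular microphone appears there.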
Similar resources
Aspects of Sentiment Analysis
This report introduces the task of sentiment analysis, describes the core problems and presents the formal definition of sentiment analysis. The basic machine learning algorithms for text classification are described as well as the most commonly used features for sentiment analysis. A brief overview of distributional semantics is presented. Related work and the state-of-the-art approaches to sent...
UNIVERSITY OF WEST BOHEMIA IN PILSEN, DEPARTMENT OF CYBERNETICS A Method for Speaker-Based Segmentation of Audio Signals
The paper deals with the problem of speaker-based segmentation. The goal of this task is to extract homogeneous segments containing the longest possible utterances produced by a single speaker. In the method presented here, no assumption is made about prior knowledge of the speaker or speech signal characteristics (there is no speaker model, no speech model, even the number of speakers in the r...
Context-dependent ASR
Computer speech recognition is gaining more and more attention these days as it is implemented in nearly every part of everyday life. But the ultimate goal is still out of reach. Automatic speech recognition (ASR) systems can work very precisely on a small domain. However, the bigger the domain, the worse the performance of the ASR system. The aim of many researchers is to diminish this problem on various level...
Hologram Synthesis by use of Patterns
The report describes a simple method for synthesising a hologram of line segments. The method is based on the diffraction pattern splatting. Optical verification of the results is included. This work has been partially supported by the Ministry of Education, Youth and Sports of the Czech Republic under the research program LC-06008 (Center for Computer Graphics). This work has been partially supp...
Web Mining and Its Applications to Researchers Support
Web mining is a newly emerging research area concerned with analyzing the World Wide Web. It is concerned mainly with its content, structure and usage. As the Web is the largest storehouse of knowledge of our time it has become essential to learn how to exploit its potential. This report aims at presenting Web mining in the context of other scientific domains and it introduces its possible appl...
Publication date: 2005